plausible point of intersection, said providing step for further attempting said finally 
separating step. 

REMARKS 

The foregoing amendments to the specification and claims under Article 41 
of the Patent Cooperation Treaty place the application into a form for prosecution 
before the U.S. Patent and Trademark Office under 35 U.S.C. §371. Accordingly, 
entry of these amendments before examination on the merits is hereby requested. 
SPECIFICATION 

TITLE 

METHOD FOR CHARACTER SEPARATION IN TEXT RECOGNITION TASKS 

BACKGROUND OF THE INVENTION 

Field of the Invention 

The invention relates to a method for character separation in text recognition tasks. 

Description of the Related Art 

In the automatic recognition of texts, that is to say when converting the graphic 
information of a document into text characters which can be further processed by 
means of electronic text processing programs, an essential precondition for a 
successful recognition operation is the precise determination of the position and the 
size of the individual characters. In the case of originals with poor lettering or fonts with 
a very narrow character space, this determination is problematic, inter alia, in that the 
characters are interconnected and "grow together", and can therefore no longer be 
separated using conventional methods such as simple contour tracking. 



SUMMARY OF THE INVENTION 

The invention is therefore based on the object of specifying an improved method 
for separating interconnected characters. 
This is performed according to the invention with the aid of a method in which 
possible points of intersection are determined in relation to the extraction objects 
under examination by means of white 
space analysis and angle analysis, in which plausible separating lines are determined 
from the points of intersection and corresponding mating points, and in which objects 
separated in such a way are subjected to a classification process and the final 
separation is performed on the basis of the results. 

A refinement of the method is advantageous in which, when there are more than 
three possible points of intersection, a first section is performed through the point of 
intersection selected fourth from the left-hand start of the character. The reason for 
this is that no conventional text character of the Latin script has more than three 
white spaces. 

It is also favorable when after a first section with a first possible point of 
intersection and a subsequent unsuccessful attempt at classification, the left-hand 
neighboring point of intersection situated closest to the first possible point of intersection 
is provided as basis for a further attempt at separation. 
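
By way of illustration only, the two refinements above can be sketched as an ordering of the attempts at separation. The function name `attempt_order` and the representation of each possible point of intersection by its x-position are assumptions made for this sketch. The starting point for three or fewer candidates is likewise an assumption, since the description only covers the case of more than three.

```python
def attempt_order(candidates_x):
    """Return the cut candidates in the order in which they would be tried.

    candidates_x: x-positions of the possible points of intersection
    (hypothetical representation, not taken from the description).
    """
    ordered = sorted(candidates_x)        # left-hand start of the character first
    if len(ordered) > 3:
        start = 3                         # fourth candidate from the left
    else:
        start = len(ordered) - 1          # assumption: begin at the rightmost one
    # After each unsuccessful attempt, move to the closest left-hand neighbor.
    return [ordered[i] for i in range(start, -1, -1)]


print(attempt_order([10, 40, 25, 60, 75]))
```

With five candidates, the first section is made through the fourth one from the left, and each further attempt moves one candidate to the left.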

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows an illustration relating to the white space analysis of an image. 

Figure 2 shows an illustration relating to the actual character separation. 



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 




The sequence of the method according to the invention is as follows: 
The method is started in the recognition operation after the determination of the 
position of the line. A white space analysis is already carried out when determining the 
circumference of a character or a plurality of connected characters by contour tracking. 
An angle analysis is performed after the complete contour is available. 

White space analysis and angle analysis are used to determine possible points 
of intersection, which supply possible separating lines in conjunction with mating points. 

The points of intersection are examined with regard to their plausibility/possibility. 
In the process, it is determined which character sequences contain the present 
combination of white spaces. Thus, for example, the following white spaces are 
contained in the letter sequence WV: TOP-BOTTOM-TOP-BOTTOM-TOP. 
Here, TOP (BOTTOM) characterizes the white space which is open at the top (bottom). 
The knowledge of the letters is now used to perform the first separation through the 
point of intersection of the fourth white space. 

It is determined thereupon to what extent the separation of the object along the 
separating lines touching at plausible points of intersection leads to plausible 
classification results. In other words, the separated characters or parts of characters are 
subjected to a recognition operation, for example by means of a neural network, and the 
separation is accepted if this operation leads to a satisfactory result, that is, a character 
recognized with high reliability. Otherwise, the separation is repeated along other 
separation lines until there is a satisfactory result. 
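
The accept-or-repeat sequence described above can be sketched as a simple loop. This is a minimal sketch under stated assumptions: the classifier is passed in as a hypothetical callable `classify` that returns a label and a reliability value, and the threshold `min_confidence` is an illustrative parameter. Neither is specified in the description.

```python
from typing import Callable, Iterable, Optional, Tuple


def separate(cuts: Iterable[int],
             classify: Callable[[int], Tuple[str, float]],
             min_confidence: float = 0.9) -> Optional[Tuple[int, str]]:
    """Try the separating lines in order and keep the first satisfactory one.

    cuts:      candidate separating lines, here identified by an integer
    classify:  hypothetical recognizer returning (label, reliability)
    """
    for cut in cuts:
        label, confidence = classify(cut)
        if confidence >= min_confidence:   # character recognized reliably
            return cut, label              # separation is accepted
    return None                            # no separation was satisfactory


# Stand-in classifier results, invented purely for the demonstration.
scores = {60: ("?", 0.30), 40: ("W", 0.95), 25: ("V", 0.99)}
print(separate([60, 40, 25], lambda cut: scores[cut]))
```

In practice the callable would wrap whatever recognition operation is used, for example a neural network as mentioned above.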
Neural networks are mathematical models which simulate the structure of the 
human brain. They comprise neurons, which are essentially summing elements with 
weighted inputs and a nonlinear amplifier component which are combined to form a 
parallel network having typically two levels. A detailed description of the feed forward 
neural networks used in the exemplary embodiment is to be found, for example, in 
"Layered Neural Nets for Pattern Recognition", B. Widrow, R.G. Winter, R.A. Baxter; 
IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 36, No. 7, July 
1988. 

Pattern recognition by means of a neural network is performed using the method 
described in "A rotation, scaling, and translation invariant pattern classification system", 
C. Yuceer, K. Oflazer; Pattern Recognition, Vol. 26, No. 5, pp. 687-710, 1993. 

The white space analysis is described in more detail with the aid of figure 1. The 
figure shows the two interconnected letters r and f, which have a white space W. Here, 
white space W means a white interspace bounded on three sides which has a certain 
depth and whose open side is directed upward or downward. This white space W is 
determined in the tracking of the contour of the character (which has grown together) 
when the contour line C transgresses two prescribed threshold values SW in both 
directions. If, as in the example, there is a white space W which is open downward, the 
highest point of the contour line C is defined as a possible point of intersection S, this 
being the lowest point in the case of a white space which is open upward. 

The sequence of the angle analysis performed thereupon is as follows: 
Two vectors, for which it holds that 

A = C[i]C[i - 5] and B = C[i]C[i + 5], 

are determined in each case from three points on the contour line C[i]. 

The angle between the two vectors is calculated. The angle is entered into a list 
if it is right-to-left with an absolute value of less than 80° and a vertex (C[i]) either 
upward or downward. 

If this condition is fulfilled for a plurality of juxtaposed vector pairs, only the angle 
with the smallest absolute value is tracked further. 
The angles entered in the list are now examined as to whether an angle of 
opposite orientation to the vertex is present on the opposite side of the contour line. If 
this is the case, the angle pair formed thereupon is stored as the position of a possible 
point of intersection. 

The sequence in the determination of the angle between two vectors which are 
defined by the three points from the contour line (C1: x1/y1, C6: x2/y2, and the vertex 
point: mx/my) is described below. The x and y components of the two vectors are 
determined therefrom. 
Ax = x1 - mx; Ay = y1 - my; Bx = x2 - mx; By = y2 - my. 

The angle between the vectors A and B is calculated as follows: firstly, the angle 
of A to the x-axis is determined, and then the angle B to the x-axis. 
\[ \text{Angle} = \arccos\left( \frac{A_x}{\sqrt{(A_x^2) + (A_y^2)}} \right) \]

\[ \text{Angle (in degrees)} = \text{Angle (in rad)} \cdot \frac{180}{\pi} \]



Angle = 360-angle B+angle A (if the angle is greater than 360°, the angle is 
corrected by 360°). 
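
The angle calculation above can be sketched as follows. The function names are hypothetical. One assumption goes beyond the description: since arccos alone only yields values between 0° and 180°, the sign of the y component is used here to extend the angle to the x-axis to the full range of 0° to 360°.

```python
import math


def angle_to_x_axis(vx, vy):
    """Angle of the vector (vx, vy) to the x-axis, in degrees."""
    length = math.hypot(vx, vy)
    if length == 0:
        raise ValueError("a zero-length vector has no direction")
    # Clamp against rounding so that acos never sees a value outside [-1, 1].
    cosine = max(-1.0, min(1.0, vx / length))
    angle = math.degrees(math.acos(cosine))   # radians * 180 / pi
    # Assumption: use the sign of vy to cover 180..360 degrees as well.
    return 360.0 - angle if vy < 0 else angle


def angle_between(a, b):
    """Angle = 360 - angle B + angle A, corrected by 360 if greater than 360."""
    angle = 360.0 - angle_to_x_axis(*b) + angle_to_x_axis(*a)
    if angle > 360.0:
        angle -= 360.0
    return angle


print(angle_between((0, 1), (1, 0)))
```

The vectors passed in would be built from the contour points as A = (x1 - mx, y1 - my) and B = (x2 - mx, y2 - my).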

The determination of the direction of the angle vertex is based on the 
consideration that in the case of a downward directed vertex the y-coordinates of the 
points C1 and C6 are smaller than the y-coordinate of the vertex point. 

In the case of an upwardly directed vertex, the y-coordinates of the points C1 and 
C6 must be greater than the y-coordinate of the vertex point. 
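
The direction test can be sketched in a few lines. The function name is hypothetical, and y1, y2 and my denote the y-coordinates of the two outer points and of the vertex point. The comparison implies image coordinates in which y grows downward, which is a reading of the text rather than something it states explicitly.

```python
def vertex_direction(y1, y2, my):
    """Classify the vertex at y-coordinate my from the two outer points."""
    if y1 < my and y2 < my:
        return "down"      # both outer points lie above the vertex
    if y1 > my and y2 > my:
        return "up"        # both outer points lie below the vertex
    return None            # no clear upward or downward vertex


print(vertex_direction(2, 3, 7))
```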

The characteristics of the printed text and the influence of the limited image 
resolution necessarily mean that, as a function of the space under consideration, the 
angles between two vectors, determined in the way described, firstly become 
increasingly smaller in the region of a kink in the contour of a character and thereafter 
continuously increase again. Consequently, only the respectively minimum angle of 
such a range is used for the further evaluation. 
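
The selection of the minimum angle of each range can be sketched as follows, assuming the angles have already been calculated for consecutive contour points. The function name `run_minima` is hypothetical. The 80° limit is taken from the description, and the checks on orientation and vertex direction are omitted for brevity.

```python
def run_minima(angles, limit=80.0):
    """Return the index of the smallest angle in each run of qualifying angles.

    angles: angle in degrees at each consecutive contour point
    limit:  angles with an absolute value below this limit qualify
    """
    minima = []
    best = None                       # index of the smallest angle in the run
    for i, angle in enumerate(angles):
        if abs(angle) < limit:
            if best is None or abs(angle) < abs(angles[best]):
                best = i
        elif best is not None:        # the run has ended
            minima.append(best)
            best = None
    if best is not None:              # a run reaching the end of the contour
        minima.append(best)
    return minima


print(run_minima([120, 70, 50, 60, 100, 75, 78, 130]))
```

Each kink in the contour thus contributes a single candidate instead of a cluster of neighboring ones.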

In order to fix a possible separating line, it is now necessary to determine for 
each possible point of intersection C(Nr) a corresponding mating point on the opposite 
branch of the contour line C(i); i = (0, ..., contour Nr). 

For this purpose, a straight line is laid through two points C(Nr-1) and C(Nr+1) 
adjacent to the possible point of intersection C(Nr) on the contour line, and the normal 
to this straight line is determined. The points adjacent to the point of intersection of this 
normal with the opposite branch of the contour line are examined with regard to their 
spacing value from the possible point of intersection and the normal, and the contour 
point with the minimum spacing value is defined as mating point C(g), and thus as 
second point of the possible separating line. The mathematical definition of this 
operation is as follows: 
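\[ n_x = C(Nr+1)_x - C(Nr-1)_x \]
\[ n_y = C(Nr+1)_y - C(Nr-1)_y \]

\[ \text{spacing} = \sqrt{(C(Nr)_x - C(i)_x)^2 + (C(Nr)_y - C(i)_y)^2} \]

\[ \text{spacing relative to } g_2 = \operatorname{abs}\bigl( n_x \cdot (C(i)_x - C(Nr)_x) + n_y \cdot (C(i)_y - C(Nr)_y) \bigr) \]

\[ \text{spacing value} = \text{spacing} + \text{spacing relative to } g_2 \]
\[ C(g) = C(i) \mid \text{spacing value}(C(g), C(Nr)) = \min \]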
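
The search for the mating point can be sketched as follows. This is a minimal sketch under stated assumptions: the contour is a list of (x, y) tuples, the function name `mating_point` is hypothetical, and the indices of the contour points to be examined on the opposite branch are passed in as `candidates`, because the description does not fix how that neighborhood is delimited.

```python
import math


def mating_point(contour, nr, candidates):
    """Return the index of the contour point with the minimum spacing value.

    contour:    list of (x, y) contour points C(i)
    nr:         index of the possible point of intersection C(Nr)
    candidates: indices on the opposite branch to examine (assumption)
    """
    cx, cy = contour[nr]
    # Direction of the straight line through the two neighbors of C(Nr).
    nx = contour[nr + 1][0] - contour[nr - 1][0]
    ny = contour[nr + 1][1] - contour[nr - 1][1]

    def spacing_value(i):
        px, py = contour[i]
        spacing = math.hypot(cx - px, cy - py)             # distance to C(Nr)
        to_normal = abs(nx * (px - cx) + ny * (py - cy))   # offset from the normal
        return spacing + to_normal

    return min(candidates, key=spacing_value)


# Upper branch running to the right, lower branch running back to the left.
contour = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
           (4, 3), (3, 3), (2, 3), (1, 3), (0, 3)]
print(mating_point(contour, 2, range(5, 10)))
```

In this example the point directly opposite C(Nr) lies on the normal, so its second term vanishes and it has the smallest spacing value.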

The actual separation is explained with the aid of figure 2. The basis of the 
separation is the contour line of the extracted character. In a first step, a separating line 
buffer is initialized with 0, which corresponds to a perpendicular line at the left-hand 
edge. Thereafter, the point situated furthest to the right (the x-value maximum) is 
determined on the contour line 1 between 0 and the point of intersection on which the 
separation is based. The point situated furthest to the right (the x-value maximum) is 
also determined on the branch of the contour line from the mating point up to the end 
of the contour 2 and on the separating line 3. 

The maximum x-values collected therefore constitute the extreme right-hand 
edge of the character used for the classification. 
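
The determination of the right-hand edge can be sketched as follows. It assumes that the contour is a list of (x, y) tuples starting at the left-hand edge, that `cut` and `mate` are the indices of the point of intersection and of its mating point, and that the separating line is straight, so that its endpoints bound its x-range. The function name is hypothetical.

```python
def right_edge(contour, cut, mate):
    """Return the maximum x-value bounding the separated character.

    contour: list of (x, y) contour points, starting at the left-hand edge
    cut:     index of the point of intersection used for the separation
    mate:    index of the corresponding mating point
    """
    first_branch = contour[:cut + 1]            # start up to the point of intersection
    closing_branch = contour[mate:]             # mating point up to the end
    separating_line = [contour[cut], contour[mate]]
    points = first_branch + closing_branch + separating_line
    return max(x for x, _ in points)


contour = [(0, 0), (2, 0), (4, 1), (6, 0), (8, 0),
           (8, 5), (6, 5), (5, 4), (2, 5), (0, 5)]
print(right_edge(contour, 2, 7))
```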

Although other modifications and changes may be suggested by those 
skilled in the art, it is the intention of the inventors to embody within the patent 
warranted hereon all changes and modifications as reasonably and properly 
come within the scope of their contribution to the art. 
ABSTRACT OF DISCLOSURE 

A method for character separation in text recognition tasks determines plausible 
points of intersection of the extraction objects under separation analysis via 
white space analysis and angle analysis. From the points of intersection and 
corresponding mating points, a determination of plausible separating lines is 
made. Thereafter, the objects separated are classified via a classification 
process. The final separation is performed based on the result of the classification 
process. 
REVISION LIST 



The bracketed numbers refer to the Page and Paragraph for the start of the paragraph 
in both the old and the new documents. 

[1:1 1:1] Add Paras "SPECIFIC ... of the Invention" 

[1:1 1:6] Del Para "Method for character ... recognition tasks" 

[1:3 1:7] Add Para "Description of the Related Art" 

[1:4 1:9] Add Para "SUMMARY OF THE INVENTION" 

[1:5 1:11] Changed "method of ... beginning, in" to "method in" 

[2:3 2:3] Changed "The invention ... of example:" to "BRIEF DESCRIPTION OF THE DRAWINGS" 

[2:4 2:4] Changed ", and" to ". " 

[2:5 2:5] Add Para "DETAILED DESCRIPTION ... PREFERRED EMBODIMENT" 

[2:9 3:3] Changed "plausibility" to "plausibility/possibility" 

[7:2 8:2] Add Paras "Although other ... classification process." 